A statistical model for robust integration of narrowband cues in speech

Authors

  • Lawrence K. Saul
  • Mazin G. Rahim
  • Jont B. Allen
Abstract

We investigate a statistical model for integrating narrowband cues in speech. The model is inspired by two ideas in human speech perception: (i) Fletcher’s hypothesis (1953) that independent detectors, working in narrow frequency bands, account for the robustness of auditory strategies, and (ii) Miller and Nicely’s analysis (1955) that perceptual confusions in noisy bandlimited speech are correlated with phonetic features. We apply the model to detecting the phonetic feature [+/−sonorant] that distinguishes vowels, approximants, and nasals (sonorants) from stops, fricatives, and affricates (obstruents). The model is represented by a multilayer probabilistic network whose binary hidden variables indicate sonorant cues from different parts of the frequency spectrum. We derive the Expectation-Maximization algorithm for estimating the model’s parameters and evaluate its performance on clean and corrupted speech. © 2001 Academic Press
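The abstract does not give the network's equations, so the following is only a minimal sketch of how binary hidden cues from separate frequency bands might be integrated. It rests on assumptions that are not stated in the abstract: each band's hidden variable z_k is driven by a logistic detector on that band's features, the sonorant label y is the logical OR of the hidden variables, and all function names, weights, and feature values are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic function mapping a real activation to a probability."""
    return 1.0 / (1.0 + math.exp(-a))


def band_cue_probs(band_features, weights, biases):
    """P(z_k = 1 | x_k) for each band k.

    Assumption: each narrowband detector is a logistic function of
    that band's feature vector. The abstract does not specify this form.
    """
    probs = []
    for x, w, b in zip(band_features, weights, biases):
        activation = sum(wi * xi for wi, xi in zip(w, x)) + b
        probs.append(sigmoid(activation))
    return probs


def prob_sonorant(cue_probs):
    """P(y = 1 | x), assuming y is the OR of independent binary cues."""
    prob_no_cue = 1.0
    for p in cue_probs:
        prob_no_cue *= 1.0 - p
    return 1.0 - prob_no_cue


def posterior_cues(cue_probs, label):
    """E-step quantity P(z_k = 1 | y = label, x) under the OR assumption.

    If y = 0, no band can have fired. If y = 1, then z_k = 1 implies
    y = 1, so the posterior is P(z_k = 1 | x) / P(y = 1 | x).
    """
    if label == 0:
        return [0.0 for _ in cue_probs]
    p_y = prob_sonorant(cue_probs)
    return [p / p_y for p in cue_probs]


# Hypothetical example: two bands, two features each, untrained detectors.
features = [[1.0, 2.0], [0.5, -0.5]]
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]
cues = band_cue_probs(features, weights, biases)
print(prob_sonorant(cues))
print(posterior_cues(cues, 1))
```

Under these assumptions, a single confident band is enough to signal a sonorant, which is one way to read the robustness claim: noise that corrupts some bands need not suppress detection in the others. In an EM procedure, posteriors like those from `posterior_cues` would serve as soft targets when re-estimating each band's detector.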

Similar resources

Robust Bandwidth Extension of Noise-co

We present a new bandwidth extension algorithm for converting narrowband telephone speech into wideband speech using a transformation in the mel cepstral domain. Unlike previous approaches, the proposed method is designed specifically for bandwidth extension of narrowband speech that has been corrupted by environmental noise. We show that by exploiting previous research in mel cepstrum feature ...

Statistical methods for incomplete speech data

Aalto University, P.O. Box 11000, FI-00076 Aalto, www.aalto.fi. Author: Ulpu Remes. Name of the doctoral dissertation: Statistical methods for incomplete speech data. Publisher: School of Electrical Engineering. Unit: Department of Signal Processing and Acoustics. Series: Aalto University publication series DOCTORAL DISSERTATIONS 149/2016. Field of research: Speech and Language Technology. Manuscript submitt...

Effects of Peripheral Tuning on the Auditory Nerve’s Representation of Speech Envelope and Temporal Fine Structure Cues

A number of studies have explored how speech envelope and temporal fine structure (TFS) cues contribute to speech perception. Some recent investigations have attempted to process speech signals to remove envelope cues and leave only TFS cues, but the results are confounded by the fact that envelope cues may be partially reconstructed when TFS signals pass through the narrowband filters...

Robust detection of phonetic features in critical bands

We consider how to detect phonetic features in noisy bandlimited speech. We propose an automatic method based on the hypothesis that independent feature detectors, working in parallel, account for the robustness of auditory strategies. Our method consists of three stages: first, speech is filtered into critical bands and enhanced by nonlinearities; second, phonetic cues are derived from narrowband...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Journal:
  • Computer Speech & Language

Volume 15

Publication date: 2001